Subset Seed Extension to Protein BLAST
Abstract
The seeding technique has become central to the theory of sequence alignment, and several efficient tools apply seeds to DNA homology search. Recently, the concept of subset seeds was proposed for similarity search in protein sequences. We experimentally evaluate the applicability of subset seeds to protein homology search. We advocate the use of multiple subset seeds derived from a hierarchical tree of amino acid residues. Our method uses an evolutionary algorithm to compute seeds that are specifically designed for a given protein family. A representation of seeds by deterministic finite automata (DFAs) is developed and built into the NCBI-BLAST software. This extended tool, named SeedBLAST, is compared to the original NCBI-BLAST and PSI-BLAST on several protein families. Our results demonstrate the superiority of SeedBLAST in terms of efficiency, especially in the case of twilight zone hits. SeedBLAST is open source software, freely available at http://bioputer.mimuw.edu.pl/papers/sblast. Supplementary material and a user manual are also provided.
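To make the idea of a subset seed concrete, here is a minimal Python sketch. It assumes that each seed position is a partition of the amino acid alphabet and that two k-mers form a hit when, at every position, their residues fall into the same group of that position's partition. The partitions (`EXACT`, `COARSE`, `ANY`), the function names, and the example sequences are hypothetical and are not taken from the paper or from SeedBLAST. The sketch also finds hits with a plain hash index rather than the DFA representation that the abstract describes, and it assumes sequences contain only the 20 standard residues.

```python
from typing import Dict, List, Sequence, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def make_letter(groups: Sequence[str]) -> Dict[str, int]:
    """Build one seed letter: map each residue to the index of its group."""
    letter: Dict[str, int] = {}
    for idx, group in enumerate(groups):
        for residue in group:
            letter[residue] = idx
    return letter

# Hypothetical seed letters, from finest to coarsest partition.
EXACT = make_letter(list(AMINO_ACIDS))    # every residue is its own group
COARSE = make_letter(["ILMV", "FWY", "KRH", "DE", "NQ", "ST", "AG", "C", "P"])
ANY = make_letter([AMINO_ACIDS])          # one group: a "don't care" position

Seed = List[Dict[str, int]]

def seed_hit(seed: Seed, a: str, b: str) -> bool:
    """True if k-mers a and b agree, group-wise, at every seed position."""
    if len(a) != len(seed) or len(b) != len(seed):
        return False
    return all(letter[x] == letter[y] for letter, x, y in zip(seed, a, b))

def seed_key(seed: Seed, kmer: str) -> Tuple[int, ...]:
    """Project a k-mer onto the group indices chosen by the seed."""
    return tuple(letter[c] for letter, c in zip(seed, kmer))

def find_hits(seed: Seed, query: str, subject: str) -> List[Tuple[int, int]]:
    """Return all (query_pos, subject_pos) pairs where the seed hits."""
    k = len(seed)
    index: Dict[Tuple[int, ...], List[int]] = {}
    for j in range(len(subject) - k + 1):
        index.setdefault(seed_key(seed, subject[j:j + k]), []).append(j)
    hits: List[Tuple[int, int]] = []
    for i in range(len(query) - k + 1):
        for j in index.get(seed_key(seed, query[i:i + k]), []):
            hits.append((i, j))
    return hits

# Illustrative seed of span 4: exact, coarse, don't care, exact.
example_seed: Seed = [EXACT, COARSE, ANY, EXACT]
```

With `example_seed`, the k-mers `AIWC` and `AVKC` form a hit because `I` and `V` share a group in `COARSE` and the third position accepts any pair, whereas `AIWC` and `AKWC` do not. Using several such seeds at once would amount to taking the union of the hits found by each seed.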
Publication date: 2011